SEQUENCE LISTING

(1) GENERAL INFORMATION: 

(i) APPLICANT: Reyes, Gregory R 

Yarbough, Patrice O 
Bradley, Daniel W 
Krawczynski, Krzysztof Z 
Tam, Albert
Fry, Kirk E 

(ii) TITLE OF INVENTION: DNA Sequences of Enterically Transmitted 
Non-A/Non-B Hepatitis Viral Agent 

(iii) NUMBER OF SEQUENCES: 20 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Dehlinger & Associates 

(B) STREET: 350 Cambridge Avenue, Suite 250 

(C) CITY: Palo Alto 

(D) STATE: CA 

(E) COUNTRY: USA 

(F) ZIP: 94306 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US 09/128,275 

(B) FILING DATE: 03-AUG-1998 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/279,823 
(B) FILING DATE: 25-JUL-1994

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 07/681,078 

(B) FILING DATE: 05-APR-1991 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 07/505,888 

(B) FILING DATE: 05-APR-1990 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 07/420,921 

(B) FILING DATE: 13-OCT-1989 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 07/367,486 

(B) FILING DATE: 16-JUN-1989

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 07/336,672 

(B) FILING DATE: 11-APR-1989

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 07/208,997 

(B) FILING DATE: 17-JUN-1988 
(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Petithory, Joanne R. 

(B) REGISTRATION NUMBER: 42,995 

(C) REFERENCE/DOCKET NUMBER: 4600-0183.24

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (650) 324-0880

(B) TELEFAX: (650) 324-0960

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1295 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 1.33 kb EcoRI insert of ET1.1, 
forward sequence 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..1293

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 2..1294

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 3..1295

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

AGACCTGTCC CTGTTGCAGC TGTTCTACCA CCCTGCCCCG AGCTCGAACA GGGCCTTCTC 60 

TACCTGCCCC AGGAGCTCAC CACCTGTGAT AGTGTCGTAA CATTTGAATT AACAGACATT 120

GTGCACTGCC GCATGGCCGC CCCGAGCCAG CGCAAGGCCG TGCTGTCCAC ACTCGTGGGC 180 

CGCTACGGCG GTCGCACAAA GCTCTACAAT GCTTCCCACT CTGATGTTCG CGACTCTCTC 240

GCCCGTTTTA TCCCGGCCAT TGGCCCCGTA CAGGTTACAA CTTGTGAATT GTACGAGCTA 300 

GTGGAGGCCA TGGTCGAGAA GGGCCAGGAT GGCTCCGCCG TCCTTGAGCT TGATCTTTGC 360 

AACCGTGACG TGTCCAGGAT CACCTTCTTC CAGAAAGATT GTAACAAGTT CACCACAGGT 420 

GAGACCATTG CCCATGGTAA AGTGGGCCAG GGCATCTCGG CCTGGAGCAA GACCTTCTGC 480

GCCCTCTTTG GCCCTTGGTT CCGCGCTATT GAGAAGGCTA TTCTGGCCCT GCTCCCTCAG 540 

GGTGTGTTTT ACGGTGATGC CTTTGATGAC ACCGTCTTCT CGGCGGCTGT GGCCGCAGCA 600 
AAGGCATCCA TGGTGTTTGA GAATGACTTT TCTGAGTTTG ACTCCACCCA GAATAACTTT 660

TCTCTGGGTC TAGAGTGTGC TATTATGGAG GAGTGTGGGA TGCCGCAGTG GCTCATCCGC 720

CTGTATCACC TTATAAGGTC TGCGTGGATC TTGCAGGCCC CGAAGGAGTC TCTGCGAGGG 780

TTTTGGAAGA AACACTCCGG TGAGCCCGGC ACTCTTCTAT GGAATACTGT CTGGAATATG 840

GCCGTTATTA CCCACTGTTA TGACTTCCGC GATTTTCAGG TGGCTGCCTT TAAAGGTGAT 900

GATTCGATAG TGCTTTGCAG TGAGTATCGT CAGAGTCCAG GAGCTGCTGT CCTGATCGCC 960

GGCTGTGGCT TGAAGTTGAA GGTAGATTTC CGCCCGATCG GTTTGTATGC AGGTGTTGTG 1020

GTGGCCCCCG GCCTTGGCGC GCTCCCTGAT GTTGTGCGCT TCGCCGGCCG GCTTACCGAG 1080

AAGAATTGGG GCCCTGGCCC TGAGCGGGCG GAGCAGCTCC GCCTCGCTGT TAGTGATTTC 1140

CTCCGCAAGC TCACGAATGT AGCTCAGATG TGTGTGGATG TTGTTTCCCG TGTTTATGGG 1200

GTTTCCCCTG GACTCGTTCA TAACCTGATT GGCATGCTAC AGGCTGTTGC TGATGGCAAG 1260

GCACATTTCA CTGAGTCAGT AAAACCAGTG CTCGA 1295

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 431 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Arg Pro Val Pro Val Ala Ala Val Leu Pro Pro Cys Pro Glu Leu Glu
1 5 10 15

Gln Gly Leu Leu Tyr Leu Pro Gln Glu Leu Thr Thr Cys Asp Ser Val
20 25 30

Val Thr Phe Glu Leu Thr Asp Ile Val His Cys Arg Met Ala Ala Pro
35 40 45

Ser Gln Arg Lys Ala Val Leu Ser Thr Leu Val Gly Arg Tyr Gly Gly
50 55 60

Arg Thr Lys Leu Tyr Asn Ala Ser His Ser Asp Val Arg Asp Ser Leu
65 70 75 80

Ala Arg Phe Ile Pro Ala Ile Gly Pro Val Gln Val Thr Thr Cys Glu
85 90 95

Leu Tyr Glu Leu Val Glu Ala Met Val Glu Lys Gly Gln Asp Gly Ser
100 105 110

Ala Val Leu Glu Leu Asp Leu Cys Asn Arg Asp Val Ser Arg Ile Thr
115 120 125

Phe Phe Gln Lys Asp Cys Asn Lys Phe Thr Thr Gly Glu Thr Ile Ala
130 135 140

His Gly Lys Val Gly Gln Gly Ile Ser Ala Trp Ser Lys Thr Phe Cys
145 150 155 160

Ala Leu Phe Gly Pro Trp Phe Arg Ala Ile Glu Lys Ala Ile Leu Ala
165 170 175

Leu Leu Pro Gln Gly Val Phe Tyr Gly Asp Ala Phe Asp Asp Thr Val
180 185 190

Phe Ser Ala Ala Val Ala Ala Ala Lys Ala Ser Met Val Phe Glu Asn
195 200 205

Asp Phe Ser Glu Phe Asp Ser Thr Gln Asn Asn Phe Ser Leu Gly Leu
210 215 220

Glu Cys Ala Ile Met Glu Glu Cys Gly Met Pro Gln Trp Leu Ile Arg
225 230 235 240

Leu Tyr His Leu Ile Arg Ser Ala Trp Ile Leu Gln Ala Pro Lys Glu
245 250 255

Ser Leu Arg Gly Phe Trp Lys Lys His Ser Gly Glu Pro Gly Thr Leu
260 265 270

Leu Trp Asn Thr Val Trp Asn Met Ala Val Ile Thr His Cys Tyr Asp
275 280 285

Phe Arg Asp Phe Gln Val Ala Ala Phe Lys Gly Asp Asp Ser Ile Val
290 295 300

Leu Cys Ser Glu Tyr Arg Gln Ser Pro Gly Ala Ala Val Leu Ile Ala
305 310 315 320

Gly Cys Gly Leu Lys Leu Lys Val Asp Phe Arg Pro Ile Gly Leu Tyr
325 330 335

Ala Gly Val Val Val Ala Pro Gly Leu Gly Ala Leu Pro Asp Val Val
340 345 350

Arg Phe Ala Gly Arg Leu Thr Glu Lys Asn Trp Gly Pro Gly Pro Glu
355 360 365

Arg Ala Glu Gln Leu Arg Leu Ala Val Ser Asp Phe Leu Arg Lys Leu
370 375 380

Thr Asn Val Ala Gln Met Cys Val Asp Val Val Ser Arg Val Tyr Gly
385 390 395 400

Val Ser Pro Gly Leu Val His Asn Leu Ile Gly Met Leu Gln Ala Val
405 410 415

Ala Asp Gly Lys Ala His Phe Thr Glu Ser Val Lys Pro Val Leu
420 425 430



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: linker - top (5') sequence

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

GGAATTCGCG GCCGCTCG 18

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: linker - bottom (3') sequence 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

CGAGCGGCCG CGAATTCCTT 20

(2) INFORMATION FOR SEQ ID NO:5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1295 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 1.33 kb EcoRI insert of ET1.1, 
reverse sequence 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
TCGAGCACTG GTTTTACTGA CTCAGTGAAA TGTGCCTTGC CATCAGCAAC AGCCTGTAGC 60

ATGCCAATCA GGTTATGAAC GAGTCCAGGG GAAACCCCAT AAACACGGGA AACAACATCC 120

ACACACATCT GAGCTACATT CGTGAGCTTG CGGAGGAAAT CACTAACAGC GAGGCGGAGC 180

TGCTCCGCCC GCTCAGGGCC AGGGCCCCAA TTCTTCTCGG TAAGCCGGCC GGCGAAGCGC 240

ACAACATCAG GGAGCGCGCC AAGGCCGGGG GCCACCACAA CACCTGCATA CAAACCGATC 300

GGGCGGAAAT CTACCTTCAA CTTCAAGCCA CAGCCGGCGA TCAGGACAGC AGCTCCTGGA 360

CTCTGACGAT ACTCACTGCA AAGCACTATC GAATCATCAC CTTTAAAGGC AGCCACCTGA 420

AAATCGCGGA AGTCATAACA GTGGGTAATA ACGGCCATAT TCCAGACAGT ATTCCATAGA 480

AGAGTGCCGG GCTCACCGGA GTGTTTCTTC CAAAACCCTC GCAGAGACTC CTTCGGGGCC 540

TGCAAGATCC ACGCAGACCT TATAAGGTGA TACAGGCGGA TGAGCCACTG CGGCATCCCA 600

CACTCCTCCA TAATAGCACA CTCTAGACCC AGAGAAAAGT TATTCTGGGT GGAGTCAAAC 660

TCAGAAAAGT CATTCTCAAA CACCATGGAT GCCTTTGCTG CGGCCACAGC CGCCGAGAAG 720

ACGGTGTCAT CAAAGGCATC ACCGTAAAAC ACACCCTGAG GGAGCAGGGC CAGAATAGCC 780

TTCTCAATAG CGCGGAACCA AGGGCCAAAG AGGGCGCAGA AGGTCTTGCT CCAGGCCGAG 840

ATGCCCTGGC CCACTTTACC ATGGGCAATG GTCTGACCTG TGGTGAACTT GTTACAATCT 900

TTCTGGAAGA AGGTGATCCT GGACACGTCA CGGTTGCAAA GATCAAGCTC AAGGACGGCG 960

GAGCCATCCT GGCCCTTCTC GACCATGGCC TCCACTAGCT CGTACAATTC ACAAGTTGTA 1020

ACCTGTACGG GGCCAATGGC CGGGATAAAA CGGGCGAGAG AGTCGCGAAC ATCAGAGTGG 1080

GAAGCATTGT AGAGCTTTGT GCGACCGCCG TAGCGGCCCA CGAGTGTGGA CAGCACGGCC 1140

TTGCGCTGGC TCGGGGCGGC CATGCGGCAG TGCACAATGT CTGTTAATTC AAATGTTACG 1200

ACACTATCAC AGGTGGTGAG GTCCTGGGGC AGGTAGAGAA GGCCCTGTTC GAGCTCGGGG 1260

CAGGGTGGTA GAACAGCTGC AACAGGGACA GGTCT 1295

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7195 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: HEV - Burma strain 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 28..5106

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 5147..7126

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 5106..5474

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 



AGGCAGACCA CATATGTGGT CGATGCCATG GAGGCCCATC AGTTTATTAA GGCTCCTGGC 60

ATCACTACTG CTATTGAGCA GGCTGCTCTA GCAGCGGCCA ACTCTGCCCT GGCGAATGCT 120

GTGGTAGTTA GGCCTTTTCT CTCTCACCAG CAGATTGAGA TCCTCATTAA CCTAATGCAA 180

CCTCGCCAGC TTGTTTTCCG CCCCGAGGTT TTCTGGAATC ATCCCATCCA GCGTGTCATC 240

CATAACGAGC TGGAGCTTTA CTGCCGCGCC CGCTCCGGCC GCTGTCTTGA AATTGGCGCC 300

CATCCCCGCT CAATAAATGA TAATCCTAAT GTGGTCCACC GCTGCTTCCT CCGCCCTGTT 360

GGGCGTGATG TTCAGCGCTG GTATACTGCT CCCACTCGCG GGCCGGCTGC TAATTGCCGG 420

CGTTCCGCGC TGCGCGGGCT TCCCGCTGCT GACCGCACTT ACTGCCTCGA CGGGTTTTCT 480

GGCTGTAACT TTCCCGCCGA GACTGGCATC GCCCTCTACT CCCTTCATGA TATGTCACCA 540

TCTGATGTCG CCGAGGCCAT GTTCCGCCAT GGTATGACGC GGCTCTATGC CGCCCTCCAT 600

CTTCCGCCTG AGGTCCTGCT GCCCCCTGGC ACATATCGCA CCGCATCGTA TTTGCTAATT 660

CATGACGGTA GGCGCGTTGT GGTGACGTAT GAGGGTGATA CTAGTGCTGG TTACAACCAC 720

GATGTCTCCA ACTTGCGCTC CTGGATTAGA ACCACCAAGG TTACCGGAGA CCATCCCCTC 780

GTTATCGAGC GGGTTAGGGC CATTGGCTGC CACTTTGTTC TCTTGCTCAC GGCAGCCCCG 840

GAGCCATCAC CTATGCCTTA TGTTCCTTAC CCCCGGTCTA CCGAGGTCTA TGTCCGATCG 900

ATCTTCGGCC CGGGTGGCAC CCCTTCCTTA TTCCCAACCT CATGCTCCAC TAAGTCGACC 960

TTCCATGCTG TCCCTGCCCA TATTTGGGAC CGTCTTATGC TGTTCGGGGC CACCTTGGAT 1020

GACCAAGCCT TTTGCTGCTC CCGTTTAATG ACCTACCTTC GCGGCATTAG CTACAAGGTC 1080

ACTGTTGGTA CCCTTGTGGC TAATGAAGGC TGGAATGCCT CTGAGGACGC CCTCACAGCT 1140

GTTATCACTG CCGCCTACCT TACCATTTGC CACCAGCGGT ATCTCCGCAC CCAGGCTATA 1200

TCCAAGGGGA TGCGTCGTCT GGAACGGGAG CATGCCCAGA AGTTTATAAC ACGCCTCTAC 1260

AGCTGGCTCT TCGAGAAGTC CGGCCGTGAT TACATCCCTG GCCGTCAGTT GGAGTTCTAC 1320

GCCCAGTGCA GGCGCTGGCT CTCCGCCGGC TTTCATCTTG ATCCACGGGT GTTGGTTTTT 1380

GACGAGTCGG CCCCCTGCCA TTGTAGGACC GCGATCCGTA AGGCGCTCTC AAAGTTTTGC 1440

TGCTTCATGA AGTGGCTTGG TCAGGAGTGC ACCTGCTTCC TTCAGCCTGC AGAAGGCGCC 1500

GTCGGCGACC AGGGTCATGA TAATGAAGGC TATGAGGGGT CCGATGTTGA CCCTGCTGAG 1560

[Nucleotides 1561-2100 are illegible in the source text; the only legible block is 2041-2050, CCTGAGGGAC.]

AATCCATTCT GTGGCGAGAG CACACTTTAC ACCCGTACTT GGTCGGAGGT [illegible] 2160

TGTAGTCCAG CCCGGCCTGA CTTAGGTTTT ATGTCTGAGC CTTCTATACC TAGTAGGGCC 2220

GCCACGCCTA CCCTGGCGGC CCCTCTACCC CCCCCTGCAC CGGACCCTTC CCCCCCTCCC 2280

TCTGCCCCGG CGCTTGCTGA GCGGGCTTCT GGCGCTACCG CCGGGGCCCC GGCCATAACT 2340

CACCAGACGG CCCGGCACCG CCGCCTGCTC TTCACCTACC CGGATGGCTC TAAGGTATTC 2400

GCCGGCTCGC TGTTCGAGTC GACATGCACG TGGCTCGTTA ACGCGTCTAA TGTTGACCAC 2460

CGCCCTGGCG GCGGGCTTTG CCATGCATTT TACCAAAGGT ACCCCGCCTC CTTTGATGCT 2520

GCCTCTTTTG TGATGCGCGA CGGCGCGGCC GCGTACACAC TAACCCCCCG GCCAATAATT 2580

CACGCTGTCG CCCCTGATTA TAGGTTGGAA CATAACCCAA AGAGGCTTGA [illegible] 2640

CGGGAAACTT GCTCCCGCCT CGGCACCGCT GCATACCCGC [illegible] [illegible] 2700

CAGGTGCCGA TCGGCCCCAG TTTTGACGCC [illegible] ACCACCGCCG [illegible] 2760

[illegible] CTGAGCTTGC TGCCAGATGG [illegible] [illegible] [illegible] 2820

CTCACTATAA [illegible] [illegible] [illegible] [illegible] [illegible] 2880

[Nucleotides 2881-3060 are illegible in the source text.]


TTTACCCCGC ATACTGCCGC CAGAGTCACC CAGGGGCGCC GGGTTGTCAT TGATGAGGCT 3120

CCATCCCTCC CCCCTCACCT GCTGCTGCTC CACATGCAGC GGGCCGCCAC CGTCCACCTT 3180

CTTGGCGACC CGAACCAGAT CCCAGCCATC GACTTTGAGC ACGCTGGGCT CGTCCCCGCC 3240

ATCAGGCCCG ACTTAGGCCC CACCTCCTGG TGGCATGTTA CCCATCGCTG GCCTGCGGAT 3300

GTATGCGAGC TCATCCGTGG TGCATACCCC ATGATCCAGA CCACTAGCCG GGTTCTCCGT 3360

TCGTTGTTCT GGGGTGAGCC TGCCGTCGGG CAGAAACTAG TGTTCACCCA GGCGGCCAAG 3420

CCCGCCAACC CCGGCTCAGT GACGGTCCAC GAGGCGCAGG GCGCTACCTA CACGGAGACC 3480

ACTATTATTG CCACAGCAGA TGCCCGGGGC CTTATTCAGT CGTCTCGGGC TCATGCCATT 3540

GTTGCTCTGA CGCGCCACAC TGAGAAGTGC GTCATCATTG ACGCACCAGG CCTGCTTCGC 3600

GAGGTGGGCA TCTCCGATGC AATCGTTAAT AACTTTTTCC TCGCTGGTGG CGAAATTGGT 3660

CACCAGCGCC CATCAGTTAT TCCCCGTGGC AACCCTGACG CCAATGTTGA CACCCTGGCT 3720

GCCTTCCCGC CGTCTTGCCA GATTAGTGCC TTCCATCAGT TGGCTGAGGA GCTTGGCCAC 3780

AGACCTGTCC CTGTTGCAGC TGTTCTACCA CCCTGCCCCG AGCTCGAACA GGGCCTTCTC 3840

TACCTGCCCC AGGAGCTCAC CACCTGTGAT AGTGTCGTAA CATTTGAATT AACAGACATT 3900

GTGCACTGCC GCATGGCCGC CCCGAGCCAG CGCAAGGCCG TGCTGTCCAC ACTCGTGGGC 3960

CGCTACGGCG GTCGCACAAA GCTCTACAAT GCTTCCCACT CTGATGTTCG CGACTCTCTC 4020

GCCCGTTTTA TCCCGGCCAT TGGCCCCGTA CAGGTTACAA CTTGTGAATT GTACGAGCTA 4080

GTGGAGGCCA TGGTCGAGAA GGGCCAGGAT GGCTCCGCCG TCCTTGAGCT TGATCTTTGC 4140

AACCGTGACG TGTCCAGGAT CACCTTCTTC CAGAAAGATT GTAACAAGTT CACCACAGGT 4200

GAGACCATTG CCCATGGTAA AGTGGGCCAG GGCATCTCGG CCTGGAGCAA GACCTTCTGC 4260

GCCCTCTTTG GCCCTTGGTT CCGCGCTATT GAGAAGGCTA TTCTGGCCCT GCTCCCTCAG 4320

GGTGTGTTTT ACGGTGATGC CTTTGATGAC ACCGTCTTCT CGGCGGCTGT GGCCGCAGCA 4380

AAGGCATCCA TGGTGTTTGA GAATGACTTT TCTGAGTTTG ACTCCACCCA GAATAACTTT 4440

TCTCTGGGTC TAGAGTGTGC TATTATGGAG GAGTGTGGGA TGCCGCAGTG GCTCATCCGC 4500

CTGTATCACC TTATAAGGTC TGCGTGGATC TTGCAGGCCC CGAAGGAGTC TCTGCGAGGG 4560

TTTTGGAAGA AACACTCCGG TGAGCCCGGC ACTCTTCTAT GGAATACTGT CTGGAATATG 4620

GCCGTTATTA CCCACTGTTA TGACTTCCGC GATTTTCAGG TGGCTGCCTT TAAAGGTGAT 4680

GATTCGATAG TGCTTTGCAG TGAGTATCGT CAGAGTCCAG GAGCTGCTGT CCTGATCGCC 4740

GGCTGTGGCT TGAAGTTGAA GGTAGATTTC CGCCCGATCG GTTTGTATGC AGGTGTTGTG 4800

GTGGCCCCCG GCCTTGGCGC GCTCCCTGAT GTTGTGCGCT TCGCCGGCCG GCTTACCGAG 4860

AAGAATTGGG GCCCTGGCCC TGAGCGGGCG GAGCAGCTCC GCCTCGCTGT TAGTGATTTC 4920


CTCCGCAAGC TCACGAATGT AGCTCAGATG TGTGTGGATG TTGTTTCCCG TGTTTATGGG 4980


GTTTCCCCTG GACTCGTTCA TAACCTGATT GGCATGCTAC AGGCTGTTGC TGATGGCAAG 5040

GCACATTTCA CTGAGTCAGT AAAACCAGTG CTCGACTTGA CAAATTCAAT CTTGTGTCGG 5100

GTGGAATGAA TAACATGTCT TTTGCTGCGC CCATGGGTTC GCGACCATGC GCCCTCGGCC 5160

TATTTTGTTG CTGCTCCTCA TGTTTTTGCC TATGCTGCCC GCGCCACCGC CCGGTCAGCC 5220

GTCTGGCCGC CGTCGTGGGC GGCGCAGCGG CGGTTCCGGC GGTGGTTTCT GGGGTGACCG 5280

GGTTGATTCT CAGCCCTTCG CAATCCCCTA TATTCATCCA ACCAACCCCT [illegible] 5340

TGTCACCGCT GCGGCCGGGG CTGGACCTCG TGTTCGCCAA CCCGCCCGAC CACTCGGCTC 5400

CGCTTGGCGT GACCAGGCCC AGCGCCCCGC CGTTGCCTCA CGTCGTAGAC [illegible] 5460

TGGGGCCGCG CCGCTAACCG CGGTCGCTCC GGCCCATGAC ACCCCGCCAG [illegible] 5520

CGACTCCCGC GGCGCCATCT TGCGCCGGCA GTATAACCTA TCAACATCTC [illegible] 5580

TTCCGTGGCC ACCGGCACTA ACCTGGTTCT TTATGCCGCC [illegible] [illegible] 5640

CCTTCAGGAC GGCACCAATA CCCATATAAT GGCCACGGAA GCTTCTAATT [illegible] 5700

CCGGGTTGCC CGTGCCACAA TCCGTTACCG [illegible] [illegible] [illegible] 5760

CGCCATCTCC ATCTCATTCT GGCCACAGAC [illegible] [illegible] [illegible] 5820

TTCAATAACC TCGACGGATG TTCGTATTTT [illegible] [illegible] [illegible] 5880

GATCCCAAGT [illegible] [illegible] [illegible] [illegible] [illegible] 5940

[Nucleotides 5941-6240 are illegible in the source text.]

CGGCGGGATA [illegible] [illegible] [illegible] [illegible] [illegible] 6300

AGAATTGATT TGGTCGGCTG [illegible] [illegible] [illegible] [illegible] 6360

TGGCGAGCCG ACTGTTAAGT [illegible] [illegible] [illegible] [illegible] 6420

TGCAATCCCG CATGACATTG [illegible] [illegible] [illegible] [illegible] 6480

CCAACATGAA CAAGATCGGC [illegible] [illegible] [illegible] [illegible] 6540

TCGAGCTAAT GATGTGCTTT [illegible] [illegible] [illegible] [illegible] 6600

TGGCTCTTCG ACTGGCCCAG TTTATGTTTC [illegible] [illegible] [illegible] 6660

CGGCGCGCAG GCCGTTGCCC GGTCGCTCGA TTGGACCAAG GTCACACTTG [illegible] 6720


CCTCTCCACC ATCCAGCAGT ACTCGAAGAC CTTCTTTGTC CTGCCGCTCC GCGGTAAGCT 6780

CTCTTTCTGG GAGGCAGGCA CAACTAAAGC CGGGTACCCT TATAATTATA ACACCACTGC 6840

TAGCGACCAA CTGCTTGTCG AGAATGCCGC CGGGCACCGG GTCGCTATTT CCACTTACAC 6900

CACTAGCCTG GGTGCTGGTC CCGTCTCCAT TTCTGCGGTT GCCGTTTTAG CCCCCCACTC 6960

TGCGCTAGCA TTGCTTGAGG ATACCTTGGA CTACCCTGCC CGCGCCCATA CTTTTGATGA 7020

TTTCTGCCCA GAGTGCCGCC CCCTTGGCCT TCAGGGCTGC GCTTTCCAGT CTACTGTCGC 7080

TGAGCTTCAG CGCCTTAAGA TGAAGGTGGG TAAAACTCGG GAGTTGTAGT TTATTTGCTT 7140

GTGCCCCCCT TCTTTCTGTT GCTTATTTCT CATTTCTGCG TTCCGCGCTC CCTGA 7195



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1693 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Met Glu Ala His Gln Phe Ile Lys Ala Pro Gly Ile Thr Thr Ala Ile
1 5 10 15

Glu Gln Ala Ala Leu Ala Ala Ala Asn Ser Ala Leu Ala Asn Ala Val
20 25 30

Val Val Arg Pro Phe Leu Ser His Gln Gln Ile Glu Ile Leu Ile Asn
35 40 45

Leu Met Gln Pro Arg Gln Leu Val Phe Arg Pro Glu Val Phe Trp Asn
50 55 60

His Pro Ile Gln Arg Val Ile His Asn Glu Leu Glu Leu Tyr Cys Arg
65 70 75 80

Ala Arg Ser Gly Arg Cys Leu Glu Ile Gly Ala His Pro Arg Ser Ile
85 90 95

Asn Asp Asn Pro Asn Val Val His Arg Cys Phe Leu Arg Pro Val Gly
100 105 110

Arg Asp Val Gln Arg Trp Tyr Thr Ala Pro Thr Arg Gly Pro Ala Ala
115 120 125

Asn Cys Arg Arg Ser Ala Leu Arg Gly Leu Pro Ala Ala Asp Arg Thr
130 135 140

Tyr Cys Leu Asp Gly Phe Ser Gly Cys Asn Phe Pro Ala Glu Thr Gly
145 150 155 160

Ile Ala Leu Tyr Ser Leu His Asp Met Ser Pro Ser Asp Val Ala Glu
165 170 175

Ala Met Phe Arg His Gly Met Thr Arg Leu Tyr Ala Ala Leu His Leu
180 185 190

Pro Pro Glu Val Leu Leu Pro Pro Gly Thr Tyr Arg Thr Ala Ser Tyr
195 200 205

Leu Leu Ile His Asp Gly Arg Arg Val Val Val Thr Tyr Glu Gly Asp
210 215 220

Thr Ser Ala Gly Tyr Asn His Asp Val Ser Asn Leu Arg Ser Trp Ile
225 230 235 240

Arg Thr Thr Lys Val Thr Gly Asp His Pro Leu Val Ile Glu Arg Val
245 250 255

Arg Ala Ile Gly Cys His Phe Val Leu Leu Leu Thr Ala Ala Pro Glu
260 265 270

Pro Ser Pro Met Pro Tyr Val Pro Tyr Pro Arg Ser Thr Glu Val Tyr
275 280 285

Val Arg Ser Ile Phe Gly Pro Gly Gly Thr Pro Ser Leu Phe Pro Thr
290 295 300

Ser Cys Ser Thr Lys Ser Thr Phe His Ala Val Pro Ala His Ile Trp
305 310 315 320

Asp Arg Leu Met Leu Phe Gly Ala Thr Leu Asp Asp Gln Ala Phe Cys
325 330 335

Cys Ser Arg Leu Met Thr Tyr Leu Arg Gly Ile Ser Tyr Lys Val Thr
340 345 350

Val Gly Thr Leu Val Ala Asn Glu Gly Trp Asn Ala Ser Glu Asp Ala
355 360 365

Leu Thr Ala Val Ile Thr Ala Ala Tyr Leu Thr Ile Cys His Gln Arg
370 375 380

Tyr Leu Arg Thr Gln Ala Ile Ser Lys Gly Met Arg Arg Leu Glu Arg
385 390 395 400

Glu His Ala Gln Lys Phe Ile Thr Arg Leu Tyr Ser Trp Leu Phe Glu
405 410 415

Lys Ser Gly Arg Asp Tyr Ile Pro Gly Arg Gln Leu Glu Phe Tyr Ala
420 425 430

Gln Cys Arg Arg Trp Leu Ser Ala Gly Phe His Leu Asp Pro Arg Val
435 440 445

Leu Val Phe Asp Glu Ser Ala Pro Cys His Cys Arg Thr Ala Ile Arg
450 455 460

Lys Ala Leu Ser Lys Phe Cys Cys Phe Met Lys Trp Leu Gly Gln Glu
465 470 475 480

Cys Thr Cys Phe Leu Gln Pro Ala Glu Gly Ala Val Gly Asp Gln Gly
485 490 495

His Asp Asn Glu Ala Tyr Glu Gly Ser Asp Val Asp Pro Ala Glu Ser
500 505 510

Ala Ile Ser Asp Ile Ser Gly Ser Tyr Val Val Pro Gly Thr Ala Leu
515 520 525

Gln Pro Leu Tyr Gln Ala Leu Asp Leu Pro Ala Glu Ile Val Ala Arg
530 535 540

Ala Gly Arg Leu Thr Ala Thr Val Lys Val Ser Gln Val Asp Gly Arg
545 550 555 560

Ile Asp Cys Glu Thr Leu Leu Gly Asn Lys Thr Phe Arg Thr Ser Phe
565 570 575

Val Asp Gly Ala Val Leu Glu Thr Asn Gly Pro Glu Arg His Asn Leu
580 585 590

Ser Phe Asp Ala Ser Gln Ser Thr Met Ala Ala Gly Pro Phe Ser Leu
595 600 605

Thr Tyr Ala Ala Ser Ala Ala Gly Leu Glu Val Arg Tyr Val Ala Ala
610 615 620

Gly Leu Asp His Arg Ala Val Phe Ala Pro Gly Val Ser Pro Arg Ser
625 630 635 640

Ala Pro Gly Glu Val Thr Ala Phe Cys Ser Ala Leu Tyr Arg Phe Asn
645 650 655

Arg Glu Ala Gln Arg His Ser Leu Ile Gly Asn Leu Trp Phe His Pro
660 665 670

Glu Gly Leu Ile Gly Leu Phe Ala Pro Phe Ser Pro Gly His Val Trp
675 680 685

Glu Ser Ala Asn Pro Phe Cys Gly Glu Ser Thr Leu Tyr Thr Arg Thr
690 695 700

Trp Ser Glu Val Asp Ala Val Ser Ser Pro Ala Arg Pro Asp Leu Gly
705 710 715 720

Phe Met Ser Glu Pro Ser Ile Pro Ser Arg Ala Ala Thr Pro Thr Leu
725 730 735

Ala Ala Pro Leu Pro Pro Pro Ala Pro Asp Pro Ser Pro Pro Pro Ser
740 745 750

Ala Pro Ala Leu Ala Glu Pro Ala Ser Gly Ala Thr Ala Gly Ala Pro
755 760 765

Ala Ile Thr His Gln Thr Ala Arg His Arg Arg Leu Leu Phe Thr Tyr
770 775 780

Pro Asp Gly Ser Lys Val Phe Ala Gly Ser Leu Phe Glu Ser Thr Cys
785 790 795 800

Thr Trp Leu Val Asn Ala Ser Asn Val Asp His Arg Pro Gly Gly Gly
805 810 815

Leu Cys His Ala Phe Tyr Gln Arg Tyr Pro Ala Ser Phe Asp Ala Ala
820 825 830

Ser Phe Val Met Arg Asp Gly Ala Ala Ala Tyr Thr Leu Thr Pro Arg
835 840 845

Pro Ile Ile His Ala Val Ala Pro Asp Tyr Arg Leu Glu His Asn Pro
850 855 860

Lys Arg Leu Glu Ala Ala Tyr Arg Glu Thr Cys Ser Arg Leu Gly Thr
865 870 875 880

Ala Ala Tyr Pro Leu Leu Gly Thr Gly Ile Tyr Gln Val Pro Ile Gly
885 890 895

Pro Ser Phe Asp Ala Trp Glu Arg Asn His Arg Pro Gly Asp Glu Leu
900 905 910

Tyr Leu Pro Glu Leu Ala Ala Arg Trp Phe Glu Ala Asn Arg Pro Thr
915 920 925

Arg Pro Thr Leu Thr Ile Thr Glu Asp Val Ala Arg Thr Ala Asn Leu
930 935 940

Ala Ile Glu Leu Asp Ser Ala Thr Asp Val Gly Arg Ala Cys Ala Gly
945 950 955 960

Cys Arg Val Thr Pro Gly Val Val Gln Tyr Gln Phe Thr Ala Gly Val
965 970 975

Pro Gly Ser Gly Lys Ser Arg Ser Ile Thr Gln Ala Asp Val Asp Val
980 985 990

Val Val Val Pro Thr Arg Glu Leu Arg Asn Ala Trp Arg Arg Arg Gly
995 1000 1005

Phe Ala Ala Phe Thr Pro His Thr Ala Ala Arg Val Thr Gln Gly Arg
1010 1015 1020

Arg Val Val Ile Asp Glu Ala Pro Ser Leu Pro Pro His Leu Leu Leu
1025 1030 1035 1040

Leu His Met Gln Arg Ala Ala Thr Val His Leu Leu Gly Asp Pro Asn
1045 1050 1055

Gln Ile Pro Ala Ile Asp Phe Glu His Ala Gly Leu Val Pro Ala Ile
1060 1065 1070

Arg Pro Asp Leu Gly Pro Thr Ser Trp Trp His Val Thr His Arg Trp
1075 1080 1085

Pro Ala Asp Val Cys Glu Leu Ile Arg Gly Ala Tyr Pro Met Ile Gln
1090 1095 1100

Thr Thr Ser Arg Val Leu Arg Ser Leu Phe Trp Gly Glu Pro Ala Val
1105 1110 1115 1120

Gly Gln Lys Leu Val Phe Thr Gln Ala Ala Lys Pro Ala Asn Pro Gly
1125 1130 1135

Ser Val Thr Val His Glu Ala Gln Gly Ala Thr Tyr Thr Glu Thr Thr
1140 1145 1150

Ile Ile Ala Thr Ala Asp Ala Arg Gly Leu Ile Gln Ser Ser Arg Ala
1155 1160 1165

His Ala Ile Val Ala Leu Thr Arg His Thr Glu Lys Cys Val Ile Ile
1170 1175 1180

Asp Ala Pro Gly Leu Leu Arg Glu Val Gly Ile Ser Asp Ala Ile Val
1185 1190 1195 1200

Asn Asn Phe Phe Leu Ala Gly Gly Glu Ile Gly His Gln Arg Pro Ser
1205 1210 1215

Val Ile Pro Arg Gly Asn Pro Asp Ala Asn Val Asp Thr Leu Ala Ala
1220 1225 1230

Phe Pro Pro Ser Cys Gln Ile Ser Ala Phe His Gln Leu Ala Glu Glu
1235 1240 1245

Leu Gly His Arg Pro Val Pro Val Ala Ala Val Leu Pro Pro Cys Pro 
1250 1255 1260 

Glu Leu Glu Gin Gly Leu Leu Tyr Leu Pro Gin Glu Leu Thr Thr Cys 
1265 1270 1275 1280 

Asp Ser Val Val Thr Phe Glu Leu Thr Asp lie Val His Cys Arg Met 
1285 1290 1295 

Ala Ala Pro Ser Gin Arg Lys Ala Val Leu Ser Thr Leu Val Gly Arg 
1300 1305 1310 

Tyr Gly Gly Arg Thr Lys Leu Tyr Asn Ala Ser His Ser Asp Val Arg 
1315 1320 1325 

Asp Ser Leu Ala Arg Phe lie Pro Ala lie Gly Pro Val Gin Val Thr 
1330 1335 1340 

Thr Cys Glu Leu Tyr Glu Leu Val Glu Ala Met Val Glu Lys Gly Gin 
1345 1350 1355 1360 

Asp Gly Ser Ala Val Leu Glu Leu Asp Leu Cys Asn Arg Asp Val Ser 
1365 1370 1375 

Arg lie Thr Phe Phe Gin Lys Asp Cys Asn Lys Phe Thr Thr Gly Glu 
1380 1385 1390 

Thr He Ala His Gly Lys Val Gly Gin Gly He Ser Ala Trp Ser Lys 
1395 1400 1405 

Thr Phe Cys Ala Leu Phe Gly Pro Trp Phe Arg Ala He Glu Lys Ala 
1410 1415 1420 

He Leu Ala Leu Leu Pro Gin Gly Val Phe Tyr Gly Asp Ala Phe Asp 
1425 1430 1435 1440 

Asp Thr Val Phe Ser Ala Ala Val Ala Ala Ala Lys Ala Ser Met Val 
1445 1450 1455 

Phe Glu Asn Asp Phe Ser Glu Phe Asp Ser Thr Gln Asn Asn Phe Ser
1460 1465 1470

Leu Gly Leu Glu Cys Ala Ile Met Glu Glu Cys Gly Met Pro Gln Trp
1475 1480 1485

Leu Ile Arg Leu Tyr His Leu Ile Arg Ser Ala Trp Ile Leu Gln Ala
1490 1495 1500 

Pro Lys Glu Ser Leu Arg Gly Phe Trp Lys Lys His Ser Gly Glu Pro 
1505 1510 1515 1520 

Gly Thr Leu Leu Trp Asn Thr Val Trp Asn Met Ala Val Ile Thr His
1525 1530 1535

Cys Tyr Asp Phe Arg Asp Phe Gln Val Ala Ala Phe Lys Gly Asp Asp
1540 1545 1550

Ser Ile Val Leu Cys Ser Glu Tyr Arg Gln Ser Pro Gly Ala Ala Val
1555 1560 1565

Leu Ile Ala Gly Cys Gly Leu Lys Leu Lys Val Asp Phe Arg Pro Ile
1570 1575 1580 

Gly Leu Tyr Ala Gly Val Val Val Ala Pro Gly Leu Gly Ala Leu Pro 
1585 1590 1595 1600 

Asp Val Val Arg Phe Ala Gly Arg Leu Thr Glu Lys Asn Trp Gly Pro 
1605 1610 1615 

Gly Pro Glu Arg Ala Glu Gln Leu Arg Leu Ala Val Ser Asp Phe Leu
1620 1625 1630 

Arg Lys Leu Thr Asn Val Ala Gln Met Cys Val Asp Val Val Ser Arg
1635 1640 1645 

Val Tyr Gly Val Ser Pro Gly Leu Val His Asn Leu Ile Gly Met Leu
1650 1655 1660

Gln Ala Val Ala Asp Gly Lys Ala His Phe Thr Glu Ser Val Lys Pro
1665 1670 1675 1680

Val Leu Asp Leu Thr Asn Ser Ile Leu Cys Arg Val Glu
1685 1690 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 660 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: 

Met Arg Pro Arg Pro Ile Leu Leu Leu Leu Leu Met Phe Leu Pro Met
1 5 10 15

Leu Pro Ala Pro Pro Pro Gly Gln Pro Ser Gly Arg Arg Arg Gly Arg
20 25 30 

Arg Ser Gly Gly Ser Gly Gly Gly Phe Trp Gly Asp Arg Val Asp Ser 
35 40 45 

Gln Pro Phe Ala Ile Pro Tyr Ile His Pro Thr Asn Pro Phe Ala Pro
50 55 60

Asp Val Thr Ala Ala Ala Gly Ala Gly Pro Arg Val Arg Gln Pro Ala
65 70 75 80

Arg Pro Leu Gly Ser Ala Trp Arg Asp Gln Ala Gln Arg Pro Ala Val
85 90 95 

Ala Ser Arg Arg Arg Pro Thr Thr Ala Gly Ala Ala Pro Leu Thr Ala 
100 105 110 

Val Ala Pro Ala His Asp Thr Pro Pro Val Pro Asp Val Asp Ser Arg 
115 120 125 

Gly Ala Ile Leu Arg Arg Gln Tyr Asn Leu Ser Thr Ser Pro Leu Thr
130 135 140

Ser Ser Val Ala Thr Gly Thr Asn Leu Val Leu Tyr Ala Ala Pro Leu
145 150 155 160

Ser Pro Leu Leu Pro Leu Gln Asp Gly Thr Asn Thr His Ile Met Ala
165 170 175

Thr Glu Ala Ser Asn Tyr Ala Gln Tyr Arg Val Ala Arg Ala Thr Ile
180 185 190

Arg Tyr Arg Pro Leu Val Pro Asn Ala Val Gly Gly Tyr Ala Ile Ser
195 200 205

Ile Ser Phe Trp Pro Gln Thr Thr Thr Thr Pro Thr Ser Val Asp Met
210 215 220

Asn Ser Ile Thr Ser Thr Asp Val Arg Ile Leu Val Gln Pro Gly Ile
225 230 235 240

Ala Ser Glu Leu Val Ile Pro Ser Glu Arg Leu His Tyr Arg Asn Gln
245 250 255 

Gly Trp Arg Ser Val Glu Thr Ser Gly Val Ala Glu Glu Glu Ala Thr 
260 265 270 

Ser Gly Leu Val Met Leu Cys Ile His Gly Ser Leu Val Asn Ser Tyr
275 280 285 

Thr Asn Thr Pro Tyr Thr Gly Ala Leu Gly Leu Leu Asp Phe Ala Leu 
290 295 300 

Glu Leu Glu Phe Arg Asn Leu Thr Pro Gly Asn Thr Asn Thr Arg Val 
305 310 315 320 

Ser Arg Tyr Ser Ser Thr Ala Arg His Arg Leu Arg Arg Gly Ala Asp 
325 330 335 

Gly Thr Ala Glu Leu Thr Thr Thr Ala Ala Thr Arg Phe Met Lys Asp 
340 345 350 

Leu Tyr Phe Thr Ser Thr Asn Gly Val Gly Glu Ile Gly Arg Gly Ile
355 360 365 

Ala Leu Thr Leu Phe Asn Leu Ala Asp Thr Leu Leu Gly Gly Leu Pro 
370 375 380 

Thr Glu Leu Ile Ser Ser Ala Gly Gly Gln Leu Phe Tyr Ser Arg Pro
385 390 395 400 

Val Val Ser Ala Asn Gly Glu Pro Thr Val Lys Leu Tyr Thr Ser Val 
405 410 415 

Glu Asn Ala Gln Gln Asp Lys Gly Ile Ala Ile Pro His Asp Ile Asp
420 425 430

Leu Gly Glu Ser Arg Val Val Ile Gln Asp Tyr Asp Asn Gln His Glu
435 440 445

Gln Asp Arg Pro Thr Pro Ser Pro Ala Pro Ser Arg Pro Phe Ser Val
450 455 460

Leu Arg Ala Asn Asp Val Leu Trp Leu Ser Leu Thr Ala Ala Glu Tyr
465 470 475 480

Asp Gln Ser Thr Tyr Gly Ser Ser Thr Gly Pro Val Tyr Val Ser Asp
485 490 495

Ser Val Thr Leu Val Asn Val Ala Thr Gly Ala Gln Ala Val Ala Arg
500 505 510 

Ser Leu Asp Trp Thr Lys Val Thr Leu Asp Gly Arg Pro Leu Ser Thr 
515 520 525 

Ile Gln Gln Tyr Ser Lys Thr Phe Phe Val Leu Pro Leu Arg Gly Lys
530 535 540 

Leu Ser Phe Trp Glu Ala Gly Thr Thr Lys Ala Gly Tyr Pro Tyr Asn 
545 550 555 560 

Tyr Asn Thr Thr Ala Ser Asp Gln Leu Leu Val Glu Asn Ala Ala Gly
565 570 575

His Arg Val Ala Ile Ser Thr Tyr Thr Thr Ser Leu Gly Ala Gly Pro
580 585 590

Val Ser Ile Ser Ala Val Ala Val Leu Ala Pro His Ser Ala Leu Ala
595 600 605 

Leu Leu Glu Asp Thr Leu Asp Tyr Pro Ala Arg Ala His Thr Phe Asp 
610 615 620 

Asp Phe Cys Pro Glu Cys Arg Pro Leu Gly Leu Gln Gly Cys Ala Phe
625 630 635 640

Gln Ser Thr Val Ala Glu Leu Gln Arg Leu Lys Met Lys Val Gly Lys
645 650 655 

Thr Arg Glu Leu 
660 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 123 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Met Asn Asn Met Ser Phe Ala Ala Pro Met Gly Ser Arg Pro Cys Ala 
1 5 10 15 

Leu Gly Leu Phe Cys Cys Cys Ser Ser Cys Phe Cys Leu Cys Cys Pro 
20 25 30 

Arg His Arg Pro Val Ser Arg Leu Ala Ala Val Val Gly Gly Ala Ala 
35 40 45 

Ala Val Pro Ala Val Val Ser Gly Val Thr Gly Leu Ile Leu Ser Pro
50 55 60

Ser Gln Ser Pro Ile Phe Ile Gln Pro Thr Pro Ser Pro Pro Met Ser
65 70 75 80 

Pro Leu Arg Pro Gly Leu Asp Leu Val Phe Ala Asn Pro Pro Asp His 
85 90 95 

Ser Ala Pro Leu Gly Val Thr Arg Pro Ser Ala Pro Pro Leu Pro His 
100 105 110 

Val Val Asp Leu Pro Gln Leu Gly Pro Arg Arg
115 120 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7171 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: Composite Mexico strain 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 



GCCATGGAGG CCCACCAGTT CATTAAGGCT CCTGGCATCA CTACTGCTAT TGAGCAAGCA 60

GCTCTAGCAG CGGCCAACTC CGCCCTTGCG AATGCTGTGG TGGTCCGGCC TTTCCTTTCC 120

CATCAGCAGG TTGAGATCCT TATAAATCTC ATGCAACCTC GGCAGCTGGT GTTTCGTCCT 180

GAGGTTTTTT GGAATCACCC GATTCAACGT GTTATACATA ATGAGCTTGA GCAGTATTGC 240

CGTGCTCGCT CGGGTCGCTG CCTTGAGATT GGAGCCCACC CACGCTCCAT TAATGATAAT 300

CCTAATGTCC TCCATCGCTG CTTTCTCCAC CCCGTCGGCC GGGATGTTCA GCGCTGGTAC 360

ACAGCCCCGA CTAGGGGACC TGCGGCGAAC TGTCGCCGCT CGGCACTTCG TGGTCTGCCA 420

CCAGCCGACC GCACTTACTG TTTTGATGGC TTTGCCGGCT GCCGTTTTGC CGCCGAGACT 480

GGTGTGGCTC TCTATTCTCT CCATGACTTG CAGCCGGCTG ATGTTGCCGA GGCGATGGCT 540

CGCCACGGCA TGACCCGCCT TTATGCAGCT TTCCACTTGC CTCCAGAGGT GCTCCTGCCT 600

CCTGGCACCT ACCGGACATC ATCCTACTTG CTGATCCACG ATGGTAAGCG CGCGGTTGTC 660

ACTTATGAGG GTGACACTAG CGCCGGTTAC AATCATGATG TTGCCACCCT CCGCACATGG 720

ATCAGGACAA CTAAGGTTGT GGGTGAACAC CCTTTGGTGA TCGAGCGGGT GCGGGGTATT 780

GGCTGTCACT TTGTGTTGTT GATCACTGCG GCCCCTGAGC CCTCCCCGAT GCCCTACGTT 840

CCTTACCCGC GTTCGACGGA GGTCTATGTC CGGTCTATCT TTGGGCCCGG CGGGTCCCCG 900

TCGCTGTTCC CGACCGCTTG TGCTGTCAAG TCCACTTTTC ACGCCGTCCC CACGCACATC 960

TGGGACCGTC TCATGCTCTT TGGGGCCACC CTCGACGACC AGGCCTTTTG CTGCTCCAGG 1020

CTTATGACGT ACCTTCGTGG CATTAGCTAT AAGGTAACTG TGGGTGCCCT GGTCGCTAAT 1080

GAAGGCTGGA ATGCCACCGA GGATGCGCTC ACTGCAGTTA TTACGGCGGC TTACCTCACA 1140

ATATGTCATC AGCGTTATTT GCGGACCCAG GCGATTTCTA AGGGCATGCG CCGGCTTGAG 1200

CTTGAACATG CTCAGAAATT TATTTCACGC CTCTACAGCT GGCTATTTGA GAAGTCAGGT 1260

CGTGATTACA TCCCAGGCCG CCAGCTGCAG TTCTACGCTC AGTGCCGCCG CTGGTTATCT 1320

GCCGGGTTCC ATCTCGACCC CCGCACCTTA GTTTTTGATG AGTCAGTGCC TTGTAGCTGC 1380

CGAACCACCA TCCGGCGGAT CGCTGGAAAA TTTTGCTGTT TTATGAAGTG GCTCGGTCAG 1440

GAGTGTTCTT GTTTCCTCCA GCCCGCCGAG GGGCTGGCGG GCGACCAAGG TCATGACAAT 1500

GAGGCCTATG AAGGCTCTGA TGTTGATACT GCTGAGCCTG CCACCCTAGA CATTACAGGC 1560

TCATACATCG TGGATGGTCG GTCTCTGCAA ACTGTCTATC AAGCTCTCGA CCTGCCAGCT 1620

GACCTGGTAG CTCGCGCAGC CCGACTGTCT GCTACAGTTA CTGTTACTGA AACCTCTGGC 1680

CGTCTGGATT GCCAAACAAT GATCGGCAAT AAGACTTTTC TCACTACCTT TGTTGATGGG 1740

GCACGCCTTG AGGTTAACGG GCCTGAGCAG CTTAACCTCT CTTTTGACAG CCAGCAGTGT 1800

AGTATGGCAG CCGGCCCGTT TTGCCTCACC TATGCTGCCG TAGATGGCGG GCTGGAAGTT 1860

CATTTTTCCA CCGCTGGCCT CGAGAGCCGT GTTGTTTTCC CCCCTGGTAA TGCCCCGACT 1920

GCCCCGCCGA GTGAGGTCAC CGCCTTCTGC TCAGCTCTTT ATAGGCACAA CCGGCAGAGC 1980

CAGCGCCAGT CGGTTATTGG TAGTTTGTGG CTGCACCCTG AAGGTTTGCT CGGCCTGTTC 2040

CCGCCCTTTT CACCCGGGCA TGAGTGGCGG TCTGCTAACC CATTTTGCGG CGAGAGCACG 2100

CTCTACACCC GCACTTGGTC CACAATTAGA GACACACCCT TAACTGTCGG GCTAATTTCC 2160

GGTCATTTGG ATGCTGCTCC CCACTCGGGG GGGCCACCTG CTACTGCCAC AGGCCCTGCT 2220

GTAGGCTCGT CTGACTCTCC AGACCCTGAC CCGCTACCTG ATGTTACAGA TGGCTCACGC 2280

CCCTCTGGGG CCCGTCCGGC TGGCCCCAAC CCGAATGGCG TTCCGCAGCG CCGCTTACTA 2340

CACACCTACC CTGACGGCGC TAAGATCTAT GTCGGCTCCA TTTTCGAGTC TGAGTGCACC 2400

TGGCTTGTCA ACGCATCTAA CGCCGGCCAC CGCCCTGGTG GCGGGCTTTG TCATGCTTTT 2460

TTTCAGCGTT ACCCTGATTC GTTTGACGCC ACCAAGTTTG TGATGCGTGA TGGTCTTGCC 2520

GCGTATACCC TTACACCCCG GCCGATCATT CATGCGGTGG CCCCGGACTA TCGATTGGAA 2580

CATAACCCCA AGAGGCTCGA GGCTGCCTAC CGCGAGACTT GCGCCCGCCG AGGCACTGCT 2640

GCCTATCCAC TCTTAGGCGC TGGCATTTAC CAGGTGCCTG TTAGTTTGAG TTTTGATGCC 2700

TGGGAGCGGA ACCACCGCCC GTTTGACGAG CTTTACCTAA CAGAGCTGGC GGCTCGGTGG 2760

TTTGAATCCA ACCGCCCCGG TCAGCCCACG TTGAACATAA CTGAGGATAC CGCCCGTGCG 2820

GCCAACCTGG CCCTGGAGCT TGACTCCGGG AGTGAAGTAG GCCGCGCATG TGCCGGGTGT 2880

AAAGTCGAGC CTGGCGTTGT GCGGTATCAG TTTACAGCCG GTGTCCCCGG CTCTGGCAAG 2940

TCAAAGTCCG TGCAACAGGC GGATGTGGAT GTTGTTGTTG TGCCCACTCG CGAGCTTCGG 3000

AACGCTTGGC GGCGCCGGGG CTTTGCGGCA TTCACTCCGC ACACTGCGGC CCGTGTCACT 3060

AGCGGCCGTA GGGTTGTCAT TGATGAGGCC CCTTCGCTCC CCCCACACTT GCTGCTTTTA 3120

CATATGCAGC GTGCTGCATC TGTGCACCTC CTTGGGGACC CGAATCAGAT CCCCGCCATA 3180

GATTTTGAGC ACACCGGTCT GATTCCAGCA ATACGGCCGG AGTTGGTCCC GACTTCATGG 3240

TGGCATGTCA CCCACCGTTG CCCTGCAGAT GTCTGTGAGT TAGTCCGTGG TGCTTACCCT 3300

AAAATCCAGA CTACAAGTAA GGTGCTCCGT TCCCTTTTCT GGGGAGAGCC AGCTGTCGGC 3360

CAGAAGCTAG TGTTCACACA GGCTGCTAAG GCCGCGCACC CCGGATCTAT AACGGTCCAT 3420

GAGGCCCAGG GTGCCACTTT TACCACTACA ACTATAATTG CAACTGCAGA TGCCCGTGGC 3480

CTCATACAGT CCTCCCGGGC TCACGCTATA GTTGCTCTCA CTAGGCATAC TGAAAAATGT 3540

GTTATACTTG ACTCTCCCGG CCTGTTGCGT GAGGTGGGTA TCTCAGATGC CATTGTTAAT 3600

AATTTCTTCC TTTCGGGTGG CGAGGTTGGT CACCAGAGAC CATCGGTCAT TCCGCGAGGC 3660

AACCCTGACC GCAATGTTGA CGTGCTTGCG GCGTTTCCAC CTTCATGCCA AATAAGCGCC 3720

TTCCATCAGC TTGCTGAGGA GCTGGGCCAC CGGCCGGCGC CGGTGGCGGC TGTGCTACCT 3780

CCCTGCCCTG AGCTTGAGCA GGGCCTTCTC TATCTGCCAC AGGAGCTAGC CTCCTGTGAC 3840

AGTGTTGTGA CATTTGAGCT AACTGACATT GTGCACTGCC GCATGGCGGC CCCTAGCGAA 3900

AGGAAAGCTG TTTTGTCCAC GCTGGTAGGC CGGTATGGCA GACGCACAAG GCTTTATGAT 3960

GCGGGTCACA CCGATGTCCG CGCCTCCCTT GCGCGCTTTA TTCCCACTCT CGGGCGGGTT 4020

ACTGCCACCA CCTGTGAACT CTTTGAGCTT GTAGAGGCGA TGGTGGAGAA GGGCCAAGAC 4080

GGTTCAGCCG TCCTCGAGTT GGATTTGTGC AGCCGAGATG TCTCCCGCAT AACCTTTTTC 4140

CAGAAGGATT GTAACAAGTT CACGACCGGC GAGACAATTG CGCATGGCAA AGTCGGTCAG 4200

GGTATCTTCC GCTGGAGTAA GACGTTTTGT GCCCTGTTTG GCCCCTGGTT CCGTGCGATT 4260

GAGAAGGCTA TTCTATCCCT TTTACCACAA GCTGTGTTCT ACGGGGATGC TTATGACGAC 4320

TCAGTATTCT CTGCTGCCGT GGCTGGCGCC AGCCATGCCA TGGTGTTTGA AAATGATTTT 4380

TCTGAGTTTG ACTCGACTCA GAATAACTTT TCCCTAGGTC TTGAGTGCGC CATTATGGAA 4440

GAGTGTGGTA TGCCCCAGTG GCTTGTCAGG TTGTACCATG CCGTCCGGTC GGCGTGGATC 4500

CTGCAGGCCC CAAAAGAGTC TTTGAGAGGG TTCTGGAAGA AGCATTCTGG TGAGCCGGGC 4560

AGCTTGCTCT GGAATACGGT GTGGAACATG GCAATCATTG CCCATTGCTA TGAGTTCCGG 4620

GACCTCCAGG TTGCCGCCTT CAAGGGCGAC GACTCGGTCG TCCTCTGTAG TGAATACCGC 4680

CAGAGCCCAG GCGCCGGTTC GCTTATAGCA GGCTGTGGTT TGAAGTTGAA GGCTGACTTC 4740

CGGCCGATTG GGCTGTATGC CGGGGTTGTC GTCGCCCCGG GGCTCGGGGC CCTACCCGAT 4800

GTCGTTCGAT TCGCCGGACG GCTTTCGGAG AAGAACTGGG GGCCTGATCC GGAGCGGGCA 4860

GAGCAGCTCC GCCTCGCCGT GCAGGATTTC CTCCGTAGGT TAACGAATGT GGCCCAGATT 4920

TGTGTTGAGG TGGTGTCTAG AGTTTACGGG GTTTCCCCGG GTCTGGTTCA TAACCTGATA 4980

GGCATGCTCC AGACTATTGG TGATGGTAAG GCGCATTTTA CAGAGTCTGT TAAGCCTATA 5040

CTTGACCTTA CACACTCAAT TATGCACCGG TCTGAATGAA TAACATGTGG TTTGCTGCGC 5100

CCATGGGTTC GCCACCATGC GCCCTAGGCC TCTTTTGCTG TTGTTCCTCT TGTTTCTGCC 5160

TATGTTGCCC GCGCCACCGA CCGGTCAGCC GTCTGGCCGC CGTCGTGGGC GGCGCAGCGG 5220

CGGTACCGGC GGTGGTTTCT GGGGTGACCG GGTTGATTCT CAGCCCTTCG CAATCCCCTA 5280

TATTCATCCA ACCAACCCCT TTGCCCCAGA CGTTGCCGCT GCGTCCGGGT CTGGACCTCG 5340

CCTTCGCCAA CCAGCCCGGC CACTTGGCTC CACTTGGCGA GATCAGGCCC AGCGCCCCTC 5400

CGCTGCCTCC CGTCGCCGAC CTGCCACAGC CGGGGCTGCG GCGCTGACGG CTGTGGCGCC 5460

TGCCCATGAC ACCTCACCCG TCCCGGACGT TGATTCTCGC GGTGCAATTC TACGCCGCCA 5520

GTATAATTTG TCTACTTCAC CCCTGACATC CTCTGTGGCC TCTGGCACTA ATTTAGTCCT 5580

GTATGCAGCC CCCCTTAATC CGCCTCTGCC GCTGCAGGAC GGTACTAATA CTCACATTAT 5640

GGCCACAGAG GCCTCCAATT ATGCACAGTA CCGGGTTGCC CGCGCTACTA TCCGTTACCG 5700

GCCCCTAGTG CCTAATGCAG TTGGAGGCTA TGCTATATCC ATTTCTTTCT GGCCTCAAAC 5760

AACCACAACC CCTACATCTG TTGACATGAA TTCCATTACT TCCACTGATG TCAGGATTCT 5820

TGTTCAACCT GGCATAGCAT CTGAATTGGT CATCCCAAGC GAGCGCCTTC ACTACCGCAA 5880

TCAAGGTTGG CGCTCGGTTG AGACATCTGG TGTTGCTGAG GAGGAAGCCA CCTCCGGTCT 5940

TGTCATGTTA TGCATACATG GCTCTCCAGT TAACTCCTAT ACCAATACCC CTTATACCGG 6000

TGCCCTTGGC TTACTGGACT TTGCCTTAGA GCTTGAGTTT CGCAATCTCA CCACCTGTAA 6060

CACCAATACA CGTGTGTCCC GTTACTCCAG CACTGCTCGT CACTCCGCCC GAGGGGCCGA 6120

CGGGACTGCG GAGCTGACCA CAACTGCAGC CACCAGGTTC ATGAAAGATC TCCACTTTAC 6180

CGGCCTTAAT GGGGTAGGTG AAGTCGGCCG CGGGATAGCT CTAACATTAC TTAACCTTGC 6240

TGACACGCTC CTCGGCGGGC TCCCGACAGA ATTAATTTCG TCGGCTGGCG GGCAACTGTT 6300

TTATTCCCGC CCGGTTGTCT CAGCCAATGG CGAGCCAACC GTGAAGCTCT ATACATCAGT 6360

GGAGAATGCT CAGCAGGATA AGGGTGTTGC TATCCCCCAC GATATCGATC TTGGTGATTC 6420

GCGTGTGGTC ATTCAGGATT ATGACAACCA GCATGAGCAG GATCGGCCCA CCCCGTCGCC 6480

TGCGCCATCT CGGCCTTTTT CTGTTCTCCG AGCAAATGAT GTACTTTGGC TGTCCCTCAC 6540

TGCAGCCGAG TATGACCAGT CCACTTACGG GTCGTCAACT GGCCCGGTTT ATATCTCGGA 6600

CAGCGTGACT TTGGTGAATG TTGCGACTGG CGCGCAGGCC GTAGCCCGAT CGCTTGACTG 6660

GTCCAAAGTC ACCCTCGACG GGCGGCCCCT CCCGACTGTT GAGCAATATT CCAAGACATT 6720

CTTTGTGCTC CCCCTTCGTG GCAAGCTCTC CTTTTGGGAG GCCGGCACAA CAAAAGCAGG 6780

TTATCCTTAT AATTATAATA CTACTGCTAG TGACCAGATT CTGATTGAAA ATGCTGCCGG 6840

CCATCGGGTC GCCATTTCAA CCTATACCAC CAGGCTTGGG GCCGGTCCGG TCGCCATTTC 6900

TGCGGCCGCG GTTTTGGCTC CACGCTCCGC CCTGGCTCTG CTGGAGGATA CTTTTGATTA 6960

TCCGGGGCGG GCGCACACAT TTGATGACTT CTGCCCTGAA TGCCGCGCTT TAGGCCTCCA 7020

GGGTTGTGCT TTCCAGTCAA CTGTCGCTGA GCTCCAGCGC CTTAAAGTTA AGGTGGGTAA 7080

AACTCGGGAG TTGTAGTTTA TTTGGCTGTG CCCACCTACT TATATCTGCT GATTTCCTTT 7140

ATTTCCTTTT TCTCGGTCCC GCGCTCCCTG A 7171



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1575 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: T: Mexican strain 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
GTTGCGTGAG GTGGGTATCT CAGATGCCAT TGTTAATAAT TTCTTCCTTT CGGGTGGCGA 60 

GGTTGGTCAC CAGAGACCAT CGGTCATTCC GCGAGGCAAC CCTGACCGCA ATGTTGACGT 120 

GCTTGCGGCG TTTCCACCTT CATGCCAAAT AAGCGCCTTC CATCAGCTTG CTGAGGAGCT 180 

GGGCCACCGG CCGGCGCCGG TGGCGGCTGT GCTACCTCCC TGCCCTGAGC TTGAGCAGGG 240

CCTTCTCTAT CTGCCACAGG AGCTAGCCTC CTGTGACAGT GTTGTGACAT TTGAGCTAAC 300 

TGACATTGTG CACTGCCGCA TGGCGGCCCC TAGCCAAAGG AAAGCTGTTT TGTCCACGCT 360 

GGTAGGCCGG TATGGCAGAC GCACAAGGCT TTATGATGCG GGTCACACCG ATGTCCGCGC 420

CTCCCTTGCG CGCTTTATTC CCACTCTCGG GCGGGTTACT GCCACCACCT GTGAACTCTT 480

TGAGCTTGTA GAGGCGATGG TGGAGAAGGG CCAAGACGGT TCAGCCGTCC TCGAGTTGGA 540

TTTGTGCAGC CGAGATGTCT CCCGCATAAC CTTTTTCCAG AAGGATTGTA ACAAGTTCAC 600

GACCGGCGAG ACAATTGCGC ATGGCAAAGT CGGTCAGGGT ATCTTCCGCT GGAGTAAGAC 660

CTTTTGTGCC CTGTTTGGCC CCTGGTTCCG TGCGATTGAG AAGGCTATTC TATCCCTTTT 720

ACCACAAGCT GTGTTCTACG GGGATGCTTA TGACGACTCA GTATTCTCTG CTGCCGTGGC 780

TGGCGCCAGC CATGCCATGG TGTTTGAAAA TGATTTTTCT GAGTTTGACT CGACTCAGAA 840

TAACTTTTCC CTAGGTCTTG AGTGCGCCAT TATGGAAGAG TGTGGTATGC CCCAGTGGCT 900

TGTCAGGTTG TACCATGCCG TCCGGTCGGC GTGGATCCTG CAGGCCCCAA AAGAGTCTTT 960

GAGAGGGTTC TGGAAGAAGC ATTCTGGTGA GCCGGGCACG TTGCTCTGGA ATACGGTGTG 1020

GAACATGGCA ATCATTGCCC ATTGCTATGA GTTCCGGGAC CTCCAGGTTG CCGCCTTCAA 1080

GGGCGACGAC TCGGTCGTCC TCTGTAGTGA ATACCGCCAG AGCCCAGGCG CCGGTTCGCT 1140

TATAGCAGGC TGTGGTTTGA AGTTGAAGGC TGACTTCCGG CCGATTGGGC TGTATGCCGG 1200

GGTTGTCGTC GCCCCGGGGC TCGGGGCCCT ACCCGATGTC GTTCGATTCG CCGGACGGCT 1260

TTCGGAGAAG AACTGGGGGC CTGATCCGGA GCGGGCAGAG CAGCTCCGCC TCGCCGTGCA 1320

GGATTTCCTC CGTAGGTTAA CGAATGTGGC CCAGATTTGT GTTGAGGTGG TGTCTAGAGT 1380

TTACGGGGTT TCCCCGGGTC TGGTTCATAA CCTGATAGGC ATGCTCCAGA CTATTGGTGA 1440

TGGTAAGGCG CATTTTACAG AGTCTGTTAA GCCTATACTT GACCTTACAC ACTCAATTAT 1500

GCACCGGTCT GAATGAATAA CATGTGGTTT GCTGCGCCCA TGGGTTCGCC ACCATGCGCC 1560

CTAGGCCTCT TTTGC 1575



(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 874 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: Tashkent strain

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

CGGGCCCCGT ACAGGTCACA ACCTGTGAGT TGTACGAGCT AGTGGAGGCC ATGGTCGAGA 60 

AAGGCCAGGA TGGCTCCGCC GTCCTTGAGC TCGATCTCTG CAACCGTGAC GTGTCCAGGA 120 

TCACCTTTTT CCAGAAAGAT TGCAATAAGT TCACCACGGG AGAGACCATC GCCCATGGTA 180

AAGTGGGCCA GGGCATTTCG GCCTGGAGTA AGACCTTCTG TGCCCTTTTC GGCCCCTGGT 240

TCCGTGCTAT TGAGAAGGCT ATTCTGGCCC TGCTCCCTCA GGGTGTGTTT TATGGGGATG 300 

CCTTTGATGA CACCGTCTTC TCGGCGCGTG TGGCCGCAGC AAAGGCGTCC ATGGTGTTTG 360 

AGAATGACTT TTCTGAGTTT GACTCCACCC AGAATAATTT TTCCCTGGGC CTAGAGTGTG 420 

CTATTATGGA GAAGTGTGGG ATGCCGAAGT GGCTCATCCG CTTGTACCAC CTTATAAGGT 480

CTGCGTGGAT CCTGCAGGCC CCGAAGGAGT CCCTGCGAGG GTGTTGGAAG AAACACTCCG 540

GTGAGCCCGG CACTCTTCTA TGGAATACTG TCTGGAACAT GGCCGTTATC ACCCATTGTT 600 

ACGATTTCCG CGATTTGCAG GTGGCTGCCT TTAAAGGTGA TGATTCGATA GTGCTTTGCA 660

GTGAGTACCG TCAGAGTCCA GGGGCTGCTG TCCTGATTGC TGGCTGTGGC TTAAAGGTGA 720 

AGGTGGGTTT CCGTCCGATT GGTTTGTATG CAGGTGTTGT GGTGACCCCC GGCCTTGGCG 780 

CGCTTCCCGA CGTCGTGCGC TTGTCCGGCC GGCTTACTGA GAAGAATTGG GGCCCTGGCC 840 

CTGAGCGGGC GGAGCAGCTC CGCCTTGCTG TGCG 874

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 449 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: Clone 406.4-2 cDNA 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 2..100

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

C GCC AAC CAG CCC GGC CAC TTG GCT CCA CTT GGC GAG ATC AGG CCC 46
Ala Asn Gln Pro Gly His Leu Ala Pro Leu Gly Glu Ile Arg Pro
1 5 10 15

AGC GCC CCT CCG CTG CCT CCC GTC GCC GAC CTG CCA CAG CCG GGG CTG 94
Ser Ala Pro Pro Leu Pro Pro Val Ala Asp Leu Pro Gln Pro Gly Leu
20 25 30 

CGG CGC TGACGGCTGT GGCGCCTGCC CATGACACCT CACCCGTCCC GGACGTTGAT 150 
Arg Arg 

TCTCGCGGTG CAATTCTACG CCGCCAGTAT AATTTGTCTA CTTCACCCCT GACATCCTCT 210

GTGGCCTCTG GCACTAATTT AGTCCTGTAT GCAGCCCCCC TTAATCCGCC TCTGCCGCTG 270

CAGGACGGTA CTAATACTCA CATTATGGCC ACAGAGGCCT CCAATTATGC ACAGTACCGG 330 

GTTGCCCGCG CTACTATCCG TTACCGGCCC CTAGTGCCTA ATGCAGTTGG AGGCTATGCT 390 

ATATCCATTT CTTTCTGGCC TCAAACAACC ACAACCCCTA CATCTGTTGA CATGAATTC 449

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Ala Asn Gln Pro Gly His Leu Ala Pro Leu Gly Glu Ile Arg Pro Ser
1 5 10 15

Ala Pro Pro Leu Pro Pro Val Ala Asp Leu Pro Gln Pro Gly Leu Arg
20 25 30 

Arg 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 130 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: Clone 406.3-2

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 5..130



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

GGAT ACT TTT GAT TAT CCG GGG CGG GCG CAC ACA TTT GAT GAC TTC TGC 49
Thr Phe Asp Tyr Pro Gly Arg Ala His Thr Phe Asp Asp Phe Cys
1 5 10 15

CCT GAA TGC CGC GCT TTA GGC CTC CAG GGT TGT GCT TTC CAG TCA ACT 97
Pro Glu Cys Arg Ala Leu Gly Leu Gln Gly Cys Ala Phe Gln Ser Thr
20 25 30 

GTC GCT GAG CTC CAG CGC CTT AAA GTT AAG GTT 130
Val Ala Glu Leu Gln Arg Leu Lys Val Lys Val
35 40 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Thr Phe Asp Tyr Pro Gly Arg Ala His Thr Phe Asp Asp Phe Cys Pro
1 5 10 15

Glu Cys Arg Ala Leu Gly Leu Gln Gly Cys Ala Phe Gln Ser Thr Val
20 25 30

Ala Glu Leu Gln Arg Leu Lys Val Lys Val
35 40 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 406.4-2 epitope - Mexican strain 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Ala Asn Gln Pro Gly His Leu Ala Pro Leu Gly Glu Ile Arg Pro Ser
1 5 10 15

Ala Pro Pro Leu Pro Pro Val Ala Asp Leu Pro Gln Pro Gly Leu Arg

Arg 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide



(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 406.4-2 epitope - Burma strain 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Ala Asn Pro Pro Asp His Ser Ala Pro Leu Gly Val Thr Arg Pro Ser 
1 5 10 15 

Ala Pro Pro Leu Pro His Val Val Asp Leu Pro Gln Leu Gly Pro Arg
20 25 30 

Arg 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 406.3-2 epitope - Mexican strain

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Thr Phe Asp Tyr Pro Gly Arg Ala His Thr Phe Asp Asp Phe Cys Pro
1 5 10 15

Glu Cys Arg Ala Leu Gly Leu Gln Gly Cys Ala Phe Gln Ser Thr Val
20 25 30

Ala Glu Leu Gln Arg Leu Lys Val Lys Val
35 40 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: 406.3-2 epitope - Burma strain 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Thr Leu Asp Tyr Pro Ala Arg Ala His Thr Phe Asp Asp Phe Cys Pro
1 5 10 15

Glu Cys Arg Pro Leu Gly Leu Gln Gly Cys Ala Phe Gln Ser Thr Val
20 25 30

Ala Glu Leu Gln Arg Leu Lys Met Lys Val
35 40 






